There are reasons why teaching behavior should be assessed, including (1) 
upgrading teacher education, (2) gaining insights into the learning of both teachers 
and children, and (3) studying social interactions. Two means of assessing teacher 
ability are quantification of teacher behavior by the use of rating scales, behavioral 
categories, etc., and participant observation (PO). The first, assessment by instrument, 
confounds the effects of too many interacting variables for the instrument to reliably 
represent the effects of teacher behavior. In the PO method, very well qualified and 
trained people are the assessing instrument. Observer judgment and observer 
influence upon the classroom situation are present, but if the observer is well qualified 
and well trained, as he must be for the success of the method, the data obtained 
should be more reliable and more relevant. Filming the classroom situation can also be 
used and adds much to the assessment process. The PO approach was tested on 
selected Head Start and elementary school classes. The data analysis from this 
testing is incomplete. It has been found, however, from a combined PO and filming of 
suburban and inner-city (Hartford, Connecticut) elementary classes, that suburban 
classes are uniformly superior to inner-city classes. (WD) 
ABSTRACT 



The rationale for participant observation calls for a greater reliance on the
experience and training of observers and on systematic procedures for sample selection
and inter-class comparisons than on the development of a system for directly and
reliably recording categories or signs of behavioral fragments. Variations in
teaching and in observation must be analyzed as interdependent sources which both
contribute meaningful descriptions of differences between classes. Recording
samples of observed behaviors is essential for training and analysis.

Applications using teams of observers in Head Start and inner city and suburban 
elementary school classes are described and discussed with reference to methodology 
and data reduction. Films were made of a stratified sample of classes in order to 
anchor observational reports and ratings and for the purpose of providing primary 
data on stylistic variation across school location and grade level. 



1 "The research reported herein was performed pursuant to a contract with the Office 
of Economic Opportunity, Executive Office of the President, Washington, D.C., 
20506. The opinions expressed herein are those of the author and should not 
be construed as representing the opinions or policy of any agency of the United 
States Government." 

2 Films for this project were made under the resourceful direction of Professor 
Alvin Fiering. Mr. Charles Kokaska performed extraordinarily in developing 
positive relationships with teachers and in supervising field observers. Miss 
Janet Hudson has indexed films and organized data with consummate skill. 
OBSERVATION OF TEACHERS AND TEACHING: 
STRATEGIES AND APPLICATIONS 1 

Frank Garfunkel 

Boston University 

INTRODUCTION 



All too often educational studies employ a single recording technique to abstract
teacher behavior into data. The monolith is this singular strategy rather than the
claims and procedures of any one school of observational thought. Such a criticism is
not confined to educational research, but applies to any studies that focus on complex
human behaviors for which there is no optimal methodology that is accepted by
professional consensus as being the epitome of validity. Although a particular
methodological approach - participant observation (Bruyn, 1966) - will be described,
the discussion of perspective is crucial to its elaboration. The vehicle of inference
for participant observation is the "observer" with experience, training, and theory
rather than a rating scale, checklist or behavioral protocol. In order to comprehend
the validity of any of these vehicles it is necessary to explore their potential diverse
contributions and to carefully describe defects in instrumentation, methodology and
substance.

Participant observation is not cast as the only or preferred approach, but rather
as a necessary component of research activity that aims at inferring useful data from
teacher behaviors. The fact that such a strategy does not result in easily reportable
and grossly comparable data should not be a deterrent to its use if there is reason to
believe that the behavior being studied is so diverse and complex that descriptive
problems are inherent because of this diversity and complexity. Social sciences (and
other sciences, as well) always run the risk of reporting that which is easy to describe
rather than that which is important to the phenomena being studied.



RATIONALE 



Strategies for obtaining data on teacher variation cover a wide range of procedures. 
Quantification is variously based on rating scales, behavioral categories, checklists, 
interaction analyses and projective inferences. Reliability is more a question 
of definition of behavioral units than of their relevance to teacher effectiveness. 

The substance of the behavior that is designated by the observational model
is a reflection of either the instrument maker's or the observer's bias. Whichever
is the case, there is always a presumption about educational goals and effective
implementation. This is just as true of rating scales as of direct measurement which
must make a prior decision about what is to be observed. It is not clear that
any extant system is based on a theory which would systematically direct us to study
particular behavioral categories.

When explicit attempts are made to empirically judge effectiveness by observing
changes in children during the time they are with a particular teacher, and,
furthermore, to select units of teacher behavior because of their relation to change,
there are snarls because teacher effects are engulfed by developmental and social class
effects and also because, perhaps more significantly, the behaviors that are most
directly affected by teachers are not easily defined or measured. Achievement tests
give an abstraction of intellectual behavior which may very well be invariant to
teacher effects, especially when compared to intelligence and social class variance.
This is not to imply that there are no teacher effects, but only that given the
instruments and variables conventionally used, for practical purposes, they are not
measurable, at least with the samples of teachers and children that have been used
in teacher effectiveness research. This is an important "at least" for, as has been
pointed out in psychotherapy research, demonstrated effectiveness of a particular
therapist or procedure is very much a function of the diagnosis and severity of the
patient. It is possible and probable that teacher effectiveness studies must take
into careful consideration the age, sex, and educational-intellectual status of
students. The teacher variable will probably prove to be more demonstrably effective
for disadvantaged, disturbed, retarded and generally disabled children than for
normal children because the variability in criteria is, to a large degree, accounted
for by independent variables that are constructively and methodologically highly
correlated. There is a confounding between the research problem - are teachers
differentially effective? - and the measurement problem, that is largely unresolved.

While admitting that the ultimate criteria of teacher effectiveness are changes 
in children, it does not necessarily follow that the important teacher variable (or 
variables) should be derived by regressing changes (in children) against a myriad 
of input variables (teacher behaviors). For this to be the recommended procedure 
it would have to be established that the criteria are desirable and that they are 
meaningfully linked to teacher behaviors, neither of which is definitively so. Research 
on teaching is faced with a forbidding gap between teaching and learning which is 
partly a function of the autonomy of teachers and partly of the nature and limitations 
of teaching and measurement technology. 

Failure to develop a predictive system for determining effectiveness has been
accompanied by (and partly by default led to) the development of authoritative
systems whereby one or more professionals describe what makes an effective teacher.
Items, scales or categories are abstracted so that they can be used by a more or
less skilled observer to obtain data on the purported effectiveness of a sample of
teachers. Behavioral units can be quite global, encompassing such broad areas as
permissiveness, warmth, creativity or control, or they can be extremely specific and
relatively nonjudgmental, such as recording the number of times or amount of time
that particular behaviors and interactions take place. Global assessment depends
on trained and experienced observers while specific assessment depends on trained but
not necessarily experienced observers (experience referring to teaching and training
referring to observer training).
The construct validity of any more or less global or specific system will depend
not only on the substance of categories or items, but on other desiderata as well.
In fact, substance might very well be of least significance in light of situational
and procedural variabilities that are often erroneously assumed to be relatively constant.
Given the fact that teachers vary, it does not necessarily follow that procedures are
directly comparable, operational goals are the same, samples of children in different
classes require the same approach, curricular and time of day variations are
insignificant, or cultural forces or particular schools are not predisposing. When the
burden is on the instrument (rather than the observer) it is difficult or impossible to
correct for confounding that is implicit in each of these sources of variation. Given
instruments will only be effective to the extent that these intervening variables are
not only controlled for (presumably by randomization or manipulation) but are measured
and, it follows, have distributions that are adequately represented in the given sample
of classes. This suggests that either studies of teaching should concentrate on
intensive surveys of relatively homogeneous clusters of classes that differ on few but
potent dimensions, or that large scale studies include manipulation of curricula,
sampling of children, in-service training and supervision. This is to say that there
is too much noise in the system for any single instrument to validly assess teacher
effectiveness. This is just as true if the instrument is based on a construct as it
is if it has been empirically derived.

Another rather imposing source of variation is the observer, both in the procedures
by which he is trained and in those that he uses in the course of his observations. It
is not only that different people see different things, but that the conditions of
training, visiting classes, feedback, and articulation cannot be assumed to be constant.
The use of a single instrument will not insure comparable data unless either the
observational process is continuously standardized, the instrument has built-in
features which suppress observer and observing contamination, additional data is
collected to provide for necessary nominal distinctions, or the variability in phenomena
being observed dominates observer variability in a direction consonant with the
purpose of the data gathering process.

It follows that no single strategy is inherently superior to another one but
that there are situational, temporal, economic, and personnel considerations which
will suggest that one approach will be more valid than another. The reduction of
teaching behavior is desirable because inference is based on more clearly understood
judgements. However, reduction can lead to spurious and often misleading data if it
is not accompanied by compatible reduction of other relevant behaviors of teachers,
children, and schools. Furthermore, the sine qua non of reduction is that the
transformation be reversible. If reduction leads to a collection of irreversible bits
that cannot be associated with the child's and teacher's other (and more global)
behaviors, then studies of teaching will leave the domain of education and enter some
other (possibly meaningful) domain. There are obviously impelling reasons why teaching
should be validly assessed, not the least of which are upgrading teacher education,
gaining insights into learning of both teachers and children and studying social
interactions. If reductionism leads away from these by so abstracting and fragmenting
behavior then it is likely that it will contribute much more to behavioral analysis
than to change.

The greater the reduction to highly reliable bits of teacher behavior, the more
likely it is that accurate predictions will be made of correspondingly reduced bits
of child behavior. Therefore, if the research goal is to get such correspondence,
disregarding its relevance for teaching and learning, then maximal reduction is to be
desired. But the reduction process, in general, ignores relevance and only accidentally
provides indices for units of behavior that are clinically meaningful. Human behavior
has not been structured (theoretically) as an accumulation of behavioral bits that go
together in an orderly and linear model. It is not at all clear that these bits have
any useful meaning by themselves. It is a pragmatic question that can be dealt with
only in terms of specified applications which become the gauge of usefulness. The
research decision to concentrate on any given units is germane not only to
methodological considerations - how is the unit best measured? - but to the theoretical
connection between teacher and learner. This connection can be conceptualized as being
mapped by any level of abstraction or generality. The crucial question arises when
clinical requirements demand reversibility - that results under any system of inquiry
be useful as feedback in order to affect behavior other than that which is under a
microscope. There is just as much need for transfer from datum to person as there
is from skill to ability. Without this transfer both systems would be sterile.

Transfer is implicit in a well ordered and predictable system where reversibility
(from behavior to abstraction to behavior) is generated from an object (intra) and
across objects (inter). An individual's within variability over abilities is
reflective of sampling variation across individuals and time and vice versa. The
Stanford-Binet IQ is reversible (for middle class children) not because we can go
directly back to the individual from the IQ, but because we can go to the sample and
then, in a meaningful way, back to the individual. "Meaningful way" refers to the well
ordered system whereby a probabilistic statement can be made about the individual's
future academic behavior with regard to the group. Without this characteristic, test
scores or observational data become one way streets that make no useful connections.

Classroom observation is up against the reversibility dilemma no matter how
abstract or reliable the protocols are. When data are obtained they may fit into a
regression analysis but they cannot be transformed back to the class either directly
or indirectly because of the lack of order in the system, either horizontally or
vertically. Because of this, films (or kinescopic tapes) are needed to provide a
mechanical vehicle for reversibility in the absence of a theoretical or empirical
vehicle. Admittedly this only provides for the reversibility; it is not established.
But at least the possibility exists. At the same time the vehicle for transfer is
present - various techniques can be applied to the same sample of classrooms.

Variability of multiple dimensions and strategies can be put to the crude, but
immediate test of viewer (film) variability. Direct comparisons can be made between
direct recordings of behavioral bits, ratings of qualities, and authoritative
judgements. And, most significantly, teachers can be confronted simultaneously
with data and behavior. For the present, films would appear to be necessary for the
development of any form of observational analysis - without films even carefully
obtained data will be lost to a specific, non-transferable and irreversible "black
box" process.

The fact that the introduction of the photographer or the observer transforms
the situation is not without theoretical interest. If non-reactive procedures can be
used in educational studies, as was done by Sexton (1961) and as is recommended by
Webb, et al. (1966), they are to be desired unless the reactive effects are
theoretically important in the reconstruction of phenomena being observed. There is
reason to believe that the principal characteristic of teaching is that it is not
observed and that feedback is not existent and, in fact, impossible. Education is
essentially a nonreactive system which is unaffected by contemporary social movements,
recent scientific advances and critical reappraisal of current practices. Authorities
set up the models and pontificate but teachers and principals run the show in autonomous
conclaves. This autonomy is personal rather than professional. Textbook and
examination conformity is obviated by variability along indeterminate and self-defeating
lines. The model of classroom observer (or photographer) is one that involves more
than an invasion into the classroom for the convenience of research. It is a different
and more viable model that permits (but does not insure) a continual reappraisal of
curriculum and behavior. The study of unobservable teachers is a paradox without
resolution. Teaching conceived as art, science, or some combination of the two is
untenable unless it can be researched on the one hand, or experienced on the other.
Given the present state of research technology, the falling tree in the forest does
not make a sound unless there is someone (or something) to hear it.

Orchestras need listeners, recorders and critics lest they exist in an incestuous
vacuum. The reinforcement of teachers consists of a bundle of meretricious acts and
words which contribute more to a religion than a profession and more to a mystical
epistemology than to a vital language that has some relation to behavior. Therefore,
the criticism that the observer changes the situation is accepted and encouraged.
That the necessary research vehicle is just as essential to pedagogy is not a
coincidence. The claim can be made (even if it cannot be rigorously supported) that any
social scientific techniques should have direct payoff to the individual or groups
being observed and manipulated. Using film to study teaching is an example of this
claim.

Disregarding the technique used to record behavior, observational studies are
usually confronted by comparisons of teaching that depend on values rather than
behavior. If comparisons are to be made between teachers who lecture and those who
lead discussions in varying subject fields, any system of measurement will break down
unless it is either assumed that one approach is inherently better than the other
(values) or that the different behaviors are irrelevant to the measurement of
effectiveness, which is to assume that goals transcend methodology. There are several
ways around this dilemma. The curriculum and/or methodology can be stipulated (Bellack,
et al., 1966) and teaching can thus be compared. Unless teachers have opportunities
for participation in several manipulations there will be teacher-method confounding.
Manipulations can be contrived (with or without teacher involvement) or they can be
unobtrusive (and thus really not manipulations) by selecting sequences of comparable
behaviors that already exist. In either case, and disregarding the observational and
recording technique, there is some control so that "everything being equal" is not a
completely empty phrase.

If manipulations of the first or second kind are impossible to accomplish,
adjustments must be made either by restricting the field of study or by using an
"instrument" that allows for diverse methods, curricula and samples. Such an "instrument"
might be a series of conditional scales which are selected by the observer depending
on the curriculum and techniques being used. Comparisons could be made on those scales
that were selected a sufficient number of times. The "instrument" could also be a
highly trained and experienced team of observers who have necessary skills to compare
somewhat dissimilar teaching situations. To assume, as is often done, that the observer
who has the task of selecting and judging will be more subjective than a series of
protocols that cannot deal with the complexities of teaching variance necessarily
involves the tautology that such an observer is definitively subjective, and direct
behavioral recording and rating scales are definitively objective. This fallacy is an
inheritance of the so called "objective" test which is presumed to be objective
because of its format, not because of its item selection, mode of inquiry or reactive
effects. Admittedly, the scoring process is less subject to the biases of the scorer
and the paper and pencil standardization conditions of test administration are
relatively constant, but this does not provide sufficient conditions for objectivity.
Reliability is an aspect of what might be referred to as internal objectivity but it
is not necessarily primary. It is necessary to consider the effect of the instrument
on not only the subject but the educational process, the selection of items, the mode
of item presentation and the problems inherent in the transformation of behavior to
data. The high reliability of "objective" tests is not without a price in external
subjectivity. The assumption that reliability is generic to validity has already been
challenged with regard to "objective" testing and it can be similarly challenged with
regard to "objective" recording of teaching behavior.

The argument is the same. The selection of items and modes of presentation
involves gross subjectivity even though recording and scoring processes (which can be
one and the same) are highly reliable. This is not to say that essay tests and the use
of the observer-as-instrument necessarily insure external objectivity but only that
they provide an alternative strategy which can more directly get at higher level
processes. Thinking, reasoning, problem solving and creativity may be vague but they
come closer to the expressed goals of education than memorizing, recalling, and
educated guessing. Similarly, the assessment of humane, creative, elaborative,
insightful, and intelligent teaching is more directly to the point than counting the
amount and number of times teachers and students ask questions, make statements,
make demands, and are silent. This is not to preclude that specifically defined
behaviors can be important indicators of generalized functions but only to gain
perspective about their limitations and the value of alternative "subjective" strategies
to approach a more profound objectivity than is to be had by using "objective"
methods exclusively.

The question of reaction is not a trivial methodological issue that can be
relegated to vagaries of research. The teacher who is "counted" and the observer who is
counting are part of the system and will respond in some way to this procedure as
opposed to an alternative one. The reductionism involved in "counting" reduces not
only behavior, but the work and status of the observer and, therefore, of the
observational process. This is not a polemic for eliminating "counting" but rather an
argument for questioning any reactive procedure, not because it is reactive, but because
of the quality and force of the reaction it might evoke.



GENERAL STATEMENT OF PROCEDURES

We address ourselves specifically to the problem of evaluating and describing
the potential effectiveness of teaching in a diverse sample of classrooms and schools
(or centers). Amount of observation will depend on sample variability and observer (O)
sophistication. In order to obtain approximations of these parameters the design calls
for multiple O's making multiple observations of classes over an extended period of time.
O's will have had teaching experience and will participate in seminars prior to and
throughout the PO. Training will consist of a variety of experiences aimed at
facilitating inter-O communication, becoming familiar with a behavioral model and
developing observation sensitivity. Seminars and workshops prior to PO will be used to
screen out unsuitable candidates. O's will participate in an observational seminar where
they collectively observe groups of children in classes and discuss, at length, teaching
and learning as they view it. O's will observe each other teaching children and discuss
varieties of approaches and values.

2 This followed Campbell and Stanley's (1963) distinction between internal and external
validity.

Films will be utilized in the observational seminar in order to allow for review
of discussed behaviors at any time. These films should show diverse teachers doing
similar tasks and similar teachers, or a given teacher, functioning in varying ways.
It is desirable for O's to view different teachers with the same group of children.

O's will keep careful logs of observed behaviors which will provide detailed
accounts of teacher, child, and interactional behaviors. Analytical reports will be
written, utilizing the logs as sources of evidence. Finally, O's will write
interpretive summaries of teachers and classes, describing their estimation of
effectiveness and indicating teaching characteristics that are critical for their
assessment. Procedures for writing these reports are set forth in greater detail in the
appendix to this report.

Scales representing important and adequately variable dimensions of teaching
and child behavior will be constructed in such a way as to relate the observed
behavior to the behavioral theory. O's will Q-sort classes on each of these scales,
rating all classes on one scale at a time, thus minimizing associational biases. O's
will underline and label logged behavioral recordings according to a notation that
relates scales to specific recorded behaviors. Scaled judgements can then be
supported by molar sequences of observed and recorded behaviors.
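
The scale-at-a-time sorting described above lends itself to a simple check of agreement between observers. What follows is a minimal sketch, not the procedure used in this study: the scale name, class labels and ranks are hypothetical, and Spearman's rank correlation is assumed here as one reasonable index of agreement between two O's who have each ranked the same classes, without ties, on a single scale.

```python
from itertools import combinations


def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two untied rankings of the same classes."""
    if set(ranks_a) != set(ranks_b):
        raise ValueError("both observers must rank the same classes")
    n = len(ranks_a)
    if n < 2:
        raise ValueError("need at least two classes")
    # Sum of squared rank differences, class by class.
    d_squared = sum((ranks_a[c] - ranks_b[c]) ** 2 for c in ranks_a)
    return 1 - 6 * d_squared / (n * (n * n - 1))


def agreement_by_scale(sorts):
    """sorts[scale][observer] maps class label -> rank (1 = highest)."""
    result = {}
    for scale, by_observer in sorts.items():
        # Compare every pair of observers, one scale at a time.
        for a, b in combinations(sorted(by_observer), 2):
            result[(scale, a, b)] = spearman_rho(by_observer[a], by_observer[b])
    return result


# Hypothetical sorts: two observers rank four classes on one invented scale.
sorts = {
    "warmth": {
        "O1": {"A": 1, "B": 2, "C": 3, "D": 4},
        "O2": {"A": 2, "B": 1, "C": 3, "D": 4},
    },
}
agreement = agreement_by_scale(sorts)
```

A coefficient of this kind only indexes agreement between O's; as the rationale argues, it says nothing by itself about the validity of what was sorted.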



MODEL 



Although participant observation (PO) varies as to the specific procedures used,
it is always based on the principle that although the observer (O) will adapt
preconceived structural outlines and dimensional scales in the course of his summary, he
is the instrument for inferring data, rather than any outlines or scales. There must
be enough intensity and duration in the involvement with the phenomena being studied
for its unique structure and process to be identifiable. The amount of contact is
a function of the kind and degree of distinctions between individuals and agencies that
are required. Once the target system is defined O has the responsibility of determining
a traffic pattern for himself which will lead to an understanding of relationships and
directions. Hypotheses are constructed by relating a presumed general theory of behavior
to the behaviors of the system. PO methodology is independent of the theory or of the
working hypotheses - but some articulated theory is necessary.

O is presumed to be experienced and trained although specifications for both
depend on task requirements. Training can be presumed from the previous experience of O
or it can take place prior to and during PO. Reliability will depend on the perspective
and sensitivity of O, and multiple O's can be used to provide anchoring if diverse
situations are to be observed. O will observe and become involved (interviews,
utilization of unobtrusive data, manipulation) to an extent necessary to test hypotheses
about predicted outcomes and structural relationships. Guidelines for participation
must be drawn up, prior to observation, with the cooperation of individuals involved.

Biases of O must be continuously dealt with but this will depend on whether
they are a legitimate source of error. Where O bias will produce variation equal
to or greater than phenomenological variation, it is necessary to articulate and
hypothesize about bias x behavior interaction in a manner suggested in general terms
by Myrdal (1953). Where bias is of minimal importance (as in many cultural
anthropological studies) it need be only articulated.
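
One rough way to see whether O bias rivals phenomenological variation is to compare the spread of observer means with the spread of class means on a single scale. The sketch below assumes a complete table of ratings in which every O rates every class; the observer names, class labels and numbers are invented for illustration, and the comparison of the two variances is a heuristic suggested here, not a significance test and not a procedure taken from this report.

```python
from statistics import mean, pvariance


def variation_sources(ratings):
    """ratings[observer][class_label] -> numeric rating on one scale."""
    observers = sorted(ratings)
    classes = sorted(ratings[observers[0]])
    for o in observers:
        if sorted(ratings[o]) != classes:
            raise ValueError("every observer must rate every class")
    # Spread of observer means: a crude index of observer bias.
    observer_means = [mean(ratings[o][c] for c in classes) for o in observers]
    # Spread of class means: a crude index of variation in the phenomena.
    class_means = [mean(ratings[o][c] for o in observers) for c in classes]
    return {
        "observer_variance": pvariance(observer_means),
        "class_variance": pvariance(class_means),
    }


# Hypothetical ratings: O2 rates every class one point higher than O1.
ratings = {
    "O1": {"A": 5, "B": 3, "C": 1},
    "O2": {"A": 6, "B": 4, "C": 2},
}
sources = variation_sources(ratings)
# True would signal that bias needs explicit hypotheses, not mere articulation.
bias_rivals_phenomena = sources["observer_variance"] >= sources["class_variance"]
```

In this invented case the observers differ by a constant, so observer variance is small relative to class variance and the bias would need only to be articulated.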

Just as in any data gathering process, inferences are only as strong as the
instruments that are used. PO depends on high quality O's who can demonstrate their
perspicacity by being able to predict interactions and circumstances and to relate
observed behavior to given theoretical models. Proof of quality can either be left
to the readers of final reports or it can be currently brought into relief by using
multiple O's with parallel systems. The test of effectiveness or precision is
clearly not a reliability coefficient or an F ratio. Any such statistical test
works smoothly once the data is obtained, disregarding the validity of the data.
PO emphasizes letting meaning speak for itself in much the same way that Skinnerians
proclaim that data should be directly recorded and then speak for itself.

The assumption of PO is that there are O's and methodologies which can be used
to obtain data that reveals more about observed processes than about O's.
Methodologies can be designed to efficiently utilize O's with given degrees of competing
biases and with specified goals with reference to designated behavioral systems. This
is to say that design will have to be adapted for known variations in O's, goals and
systems.

PO is not a clearly defined methodology that is uniformly used in the social
sciences. The practice of having an O look closely at a segment of interpersonal
(or individual) behavior is simpleminded and elementary. Where more clearly defined
procedures are appropriate they should certainly be used. The designation of
adequate O's is difficult and perhaps, often, impossible. It might appear that PO
is a regression to pre-scientific methodology, where uncontrolled judgements are
combined with unknown weights. But it is even less scientific to use "powerful"
instruments to perform tasks for which they are unsuited. The decision to use PO is made
in light of the complexities of teaching, the difficulties of obtaining comparable
samples of behavior, the problems of irreversibility, the tenuousness of child behavioral
criteria and the obscurity, ineffectiveness and inappropriateness of personality
measurement for obtaining adequate measurements of teacher characteristics. This could
lead to the abandonment of such research or, as in the case of PO, to the adaptation of
relatively crude processes which can, albeit subjectively, deal with those obstacles.
Developments in audio-visual technology will make it possible to give more substance to
the inferences of O's and to provide reasonably direct documentation of classroom
processes that can be exposed to more verified procedures.




APPLICATIONS IN HEAD START AND ELEMENTARY SCHOOL CLASSES 



Applications of modified participant observation approaches were made on selected 
Head Start and elementary school classes in connection with two projects which were 
taking place concurrently. The first involved twenty Head Start classes which were 
being evaluated by the Boston University Head Start Evaluation and Research Center as 
a part of its participation in the National Evaluation Program. The second was with 
Project Concern, an experimental study of the effects of suburban education on 
inner-city children in and around Hartford, Connecticut. Since the data from both of these 
projects on the tested and observed performances of individual children 
have not been made available, this report is necessarily incomplete. Procedures for 
observing classes and obtaining data will be described in some detail, and preliminary 
descriptive statements will be made with regard to the dimensionality of the scales that 
were used in each investigation and the agreement between raters on a variety of scale 
ratings. In addition, for the Project Concern application, the division of classes 
into inner-city de facto segregated units and suburban units with one, two or three bussed 
Negro children in them permits a straightforward comparison over location of classes. 

Although the general principles behind participant observation, as developed by 
Bruyn (1966), were followed in the development and carrying out of procedures, the 
sustained and intensive contact of observers with classrooms and schools was not 
followed, partly by choice, because of the kinds of variation that were of most interest, 
and partly by necessity. Future studies will provide for considerably more contact 
between observers and the institutions they are observing in order to realize the depth 
which is only being approximated by the procedures to be reported herein. 

The aims of these studies were twofold: 1) to study the relationship between 
selected characteristics of teacher style and changes in mental abilities, academic 
achievement, personal-social development and creativity of children in selected 
classrooms; 2) to describe, through cross-sectional procedures, teaching situations 
to which Head Start children are exposed and those to which they will most probably 
be exposed if they attend inner-city or suburban elementary classrooms. 

PROCEDURES 



Both applications called for the recruiting and training of observers who had 
extensive experience both as teachers and as observers of preschool and elementary 
school classes. Initial training sessions involved observation of classes and 
discussion of an all-inclusive categorical model of classroom procedures (Appendix C). 
This model was not for the purpose of providing a checklist or of focusing observers' 
attention on particular variables so much as it was for directing their attention to 
all possible contingencies and teaching situations. The model included listings, under 
the general heading of instruction, of materials, lessons, motivation, evaluation, and 
achievement. A second section under the general heading of controls included form, 
quantity, tone, consistency and student pressure. Facilities listed characteristics 
and implications for teaching. Student interaction included opportunity 
characteristics. A last category, teacher-student interaction, included humor, address, feelings, 
and reinforcement. This model was meant to be a vehicle which would serve to provoke 
discussion and generate questions about varieties of teaching experiences. In addition, 
an exhaustive list of variables associated with teachers, students and curriculum was 
constructed, through the deliberations of observers, in order to sensitize them to 
differences between independent, intervening and dependent variables (Appendix F). 

It is critical to note that the models developed from observational seminars and 
were, therefore, the product of the efforts of observers. They were not handed 
listings of categories and variables which had been developed externally and which 
would have been, therefore, imposed upon them. 

Observers were asked to keep detailed notes on their observations without 
regard to a particular model, but with specific regard to what they considered 
to be the most important characteristics of the classrooms they were observing. 
These notes were to be transformed into process reports which were to be concluded 
by analytical reports and summary interpretations (Appendix D). 

Scales were developed for both studies by observers after carefully and 
deductively describing contrasting characteristics of teaching situations which 
observers judged as being relatively unique (see Appendices A and E). The scales 
are, therefore, a reflection of differences seen by observers, rather than the 
basis for making distinctions. This meant that this approach to studying teaching 
involved a concomitant study of observer variation, and that these two separate 
focuses were mutually interdependent. 

The burden of responsibility was clearly on observers rather than scales and 
it called for an inferential process which would be only as defensible as the 
perceptiveness and intelligence of the observers permitted. This process structures 
a systematic approach to dealing with subjective impressions of observers who are 
required to defend these impressions in the face of careful scrutiny by other observers 
and by senior members of the project staff. The process assumes that each observer 
has enough experience and insight to be able to produce salient reports and 
interpretations of teaching variation. Resulting inference must attend to both sources 
of variation, teaching and observing, in order to adequately describe stylistic 
variation within stylistic categories. 

In order to provide a superstructure for teaching and observing variations, films 
of selected classes were developed. In covering a wide range of activities, these 
films have provided and will continue to provide referent behaviors for the reports and ratings 
of observers. Extensive use of these films has been and is continuing to be made in 
order to clarify reductions of behavior that were made by observers. 



OBSERVATION OF HEAD START CLASSES 

Of the twenty sample classes used in the National Evaluation Program, nineteen 
were observed sufficiently by two or more observers to produce reports and ratings 
on a series of scales which were constructed by observers during the course of their 
observations. 

Eight scales were used in rating nineteen teachers by six observers, with each 
teacher being rated by two, three, or four separate observers. 

The scales were as follows: 

1. Attitude towards teaching situation 

2. Teacher's differentiation of children and activities 

3. Predominant emphasis of curriculum 

4. Purposefulness of classroom behavior 

5. Control of materials and interactions 

6. Communication-responsiveness 

7. Work-play continuum 

8. Overall rating 

The detailed statements about each of these scales were given to each observer 
and can be found in Appendix A. 

Rater agreement on the individual scales varied between 80% and 90%, and on the overall 
rating the agreement was 92%. Interscale correlations varied between .60 and .90. 
Variation between classes appears to be sufficient to allow for maximal rater agreement 
as well as the probable inflation of scale inter-correlation. 
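
The report does not state how the agreement percentages were computed. As one illustrative possibility only, the sketch below assumes agreement means the share of teachers that a pair of observers placed in the same category on a scale, pooled over all observer pairs who rated the same teacher. The function name `percent_agreement` and the ratings are hypothetical and are not taken from the study.

```python
from itertools import combinations


def percent_agreement(ratings_by_observer):
    """Pooled pairwise percent agreement on one scale.

    ratings_by_observer maps observer -> {teacher: category}.
    Only teachers rated by both members of a pair are compared,
    since each teacher was rated by two to four observers.
    """
    agree = total = 0
    for a, b in combinations(ratings_by_observer, 2):
        shared = ratings_by_observer[a].keys() & ratings_by_observer[b].keys()
        for teacher in shared:
            total += 1
            agree += ratings_by_observer[a][teacher] == ratings_by_observer[b][teacher]
    return 100.0 * agree / total if total else float("nan")


# Hypothetical ratings of five teachers by two observers on one scale.
example = {
    "O1": {"T1": 3, "T2": 1, "T3": 2, "T4": 3, "T5": 1},
    "O2": {"T1": 3, "T2": 1, "T3": 2, "T4": 2, "T5": 1},
}
print(percent_agreement(example))
```

With these made-up ratings the two observers agree on four of five teachers. A chance-corrected index would give lower figures than raw agreement, which is one reason the method of computation matters when reading the percentages above.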

Observers were instructed to sort all teachers on each scale, rather than rating 
each teacher on all scales, in order to minimize halo effects. 

Since four of the six observers had training and experience in early childhood 
education and, consequently, held a point of view which valued highly differentiated 
programs with a considerable amount of freedom for individual children, resulting ratings 
are necessarily a reflection of this point of view and are, therefore, limited in their 
generality. Observational teams that participate in such a strategy should represent 
a wide spectrum of points of view with at least two observers representing each major 
variation. Similarly, it is essential to obtain samples of classes where competence 
and style are relatively independent so that their respective sources of variance can 
be partialed out. 

PROJECT CONCERN: Comparisons of inner-city and suburban classes. 

Project Concern is a large scale interventional project which provides for 
educational placement and supportive services for 250 inner-city children. The inner- 
city children are all residents of Hartford, Connecticut, and the experimental inter- 
vention consists of placement in surrounding middle class suburban schools. A randomly 
selected control group of 250 children is being studied concurrently in order to test 
hypotheses regarding the differential effects of inner-city and suburban schools on 
children. A summary of the theoretical framework and the experimental design of 
Project Concern can be found in Appendix B. 

The Boston University Head Start Evaluation and Research Center has been involved 
in observing and filming a random sample of classes that contain experimental and 
control children. Observations have also been made on a sample of Head Start classes, 
so that educational continuity between Head Start and elementary school could be 
ascertained. Filming took place within a careful observational survey design so that 
the validity of the filming process could be evaluated. 

From thirty-nine schools involved in Project Concern, thirty-eight classes were 
selected for the observational and film survey. Ten of these classes were filmed over 
a five-month period. The extent to which filmed behaviors of particular classes 
represent those classes, as well as the extent to which the filmed classes are representative 
of all classes, is presently under careful consideration. Findings thus far are that 
independent observers can go from films to reports and from reports to films with equal 
facility and that ratings of films are in almost complete agreement with observer 
ratings made of filmed classes at other times during the year. 

Both the observational and film surveys included kindergarten, first, second, 
third, and fifth grades in both inner-city and suburban schools. Inner-city classes 
were selected randomly (stratified on grade) from the total pool of control classes. 
Suburban classes were selected randomly from the two communities that had the greatest 
participation in the project and that represented more and less cooperative communities with 
regard to Project Concern. 

The observational team consisted of five observers with widely different 
backgrounds and points of view. They were trained, respectively, in preschool education, 
elementary education, elementary and special education, secondary and special education, 
and elementary education and counseling. Each observer was randomly assigned a sample 
of classes in both inner-city and suburban schools. They were required to make at least 
two extended observations, separated in time by at least one week, and, preferably, three 
or four separate observations. In addition, each observer was required to observe 
classes of two other observers at least once and, preferably, twice each. 

Observers wrote process and interpretive reports and rated each class on ten scales 
that had been derived by the observational team from preliminary observations of the 
total sample of classes. A sorting technique was used so that a given rater would 
focus on inter-class variability over each scale, rather than within-class variability 
on all scales. 

The scales derived are as follows: 

1. Involvement and interest of children 

2. Purposeful behavior of class 

3. Source of direction of academic activities 

4. Nature of control over behavior 

5. Effectiveness of behavioral controls 

6. Quality of presentation of subject and materials 

7. Differentiation of instruction 

8. Teacher reaction to classroom situation 

9. Reinforcement of behavior of children 

10. Nature of reinforcement 



With the exception of scales five and nine, there appears to be a general factor 
which differentiated teaching in observed classes. Intercorrelations between scales 
ranged between .60 and .80, and the internal consistency of the scales is well documented 
across all observers by scale-total score correlations of .80 to .90, with the exception 
of the two scales mentioned. Rater agreement on individual scales, with the exception 
of scale 9, varied between .50 and .60, and rater agreement on the cumulative mean rating 
that was made by each observer on each teacher was .65. 
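
For readers unfamiliar with scale-total correlations, the following is a minimal sketch of how such figures can be obtained. It assumes the total score is the plain sum of all scale ratings for a class, including the scale being correlated; the report does not say whether a corrected total was used. The ratings shown are hypothetical and the function names are illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def scale_total_correlations(ratings):
    """Correlate each scale with the total score.

    ratings holds one row per class; each row lists that class's
    rating on every scale. The total is the uncorrected row sum
    (an assumption, see the text above).
    """
    totals = [sum(row) for row in ratings]
    n_scales = len(ratings[0])
    return [pearson([row[j] for row in ratings], totals) for j in range(n_scales)]


# Hypothetical ratings: five classes on three scales.
example_ratings = [
    [5, 4, 2],
    [4, 4, 3],
    [3, 3, 1],
    [2, 1, 3],
    [1, 2, 2],
]
print([round(r, 2) for r in scale_total_correlations(example_ratings)])
```

In this made-up example the first scale tracks the total closely while the third does not, which is the pattern the text describes for the two scales that fall outside the general factor.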

There are important differences between raters, as is reflected by their 
respective interscale correlation matrices. For two of the raters, the interscale correlations 
were generally between .40 and .60, while two of the other observers had interscale 
correlations between .75 and .85. Subsequent data analyses which are aimed at 
establishing differential effects within suburban and inner-city classes will treat observer 
score matrices separately in order to assess the validity of different observational 
points of view with respect to predicting change in diverse educational settings. 

Data obtained from the scales were unequivocal in showing suburban classes to be 
uniformly superior to inner-city classes. Seventy-five percent of the suburban classes 
were above the median and seventy percent of the inner-city classes were below the 
median, a difference which was highly statistically significant on a "t" test. 

Differences between inner-city and suburban classes were statistically significant 
on all scales except 5, effectiveness of control; 9, reinforcement of behavior; and 10, 
the nature of reinforcement. 

Thus, observational ratings clearly distinguish inner-city and suburban classes 
on selected scales and on mean rating over all scales. However, 30% of the classes over- 
lap, five suburban classes being below the median and six inner-city classes being 
above the median. 
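
The median split reported above can be laid out as a two-by-two table. The sketch below is an illustration and not the authors' analysis: the report used a "t" test, whereas this example substitutes a Pearson chi-square without continuity correction because it can be computed from the counts alone. The group sizes of twenty are an assumption back-calculated from the reported figures (five is 25% of twenty, six is 30% of twenty); the report elsewhere gives thirty-eight classes in all, so the counts should be read as approximate.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Counts implied by the reported percentages (group sizes are assumed).
suburban_above, suburban_below = 15, 5   # 75% of suburban classes above the median
inner_above, inner_below = 6, 14         # 70% of inner-city classes below the median

chi2 = chi_square_2x2(suburban_above, suburban_below, inner_above, inner_below)

# Classes falling on the "unexpected" side of the median for their location.
n_classes = suburban_above + suburban_below + inner_above + inner_below
overlap = (suburban_below + inner_above) / n_classes

print(round(chi2, 2), round(overlap, 3))
```

Under these assumed counts the statistic is about 8.12, which exceeds 6.635, the critical value for one degree of freedom at the .01 level. The overlap of eleven classes is a reminder that location alone does not determine where a class falls on these ratings.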

These observational data will be used to modify the prediction of change 
in inner-city and suburban classes, in order to determine whether high-quality (as here 
defined) classes in inner-city schools are associated with changes in children similar to those in 
high-quality classes in suburban schools and, similarly, whether low-quality instruction 
in the suburbs is associated with changes similar to those under low-quality instruction in the inner city. 

DISCUSSION 

This carefully structured observational survey demonstrated the degree and kind 
of difference that is manifest between inner-city and suburban classes. This is backed 
up by a film survey of selected classes, kindergarten through grade five, in inner-city 
and suburban schools. There is a close correspondence between filmed behaviors and 
those that are reported in the data analysis of the scales used by observers. In both 
cases it is apparent that inner-city schools are characterized by relatively uninvolved 
children, classes with extremely restricted purposes and teachers who tend to 
pervasively control materials and children. This control is often expressed as coercion 
and threats and is accompanied by a rather pedestrian presentation of materials with 
relatively little differentiation of instruction. Inner-city teachers appear to enjoy 
their teaching less than suburban teachers. These differences are quite apparent in the 
films, which are presently being prepared for showings at several national conventions. 
Inner-city and suburban classrooms will be displayed simultaneously on two adjacent 
screens in order to bring these comparisons into relief. Films have been subjected 
to detailed analyses in order to refine scalar differences. Films of the inner-city 
and suburban classes have been combined with films of Head Start classes in order 
to specifically and objectively present a cross-sectional longitudinal comparison 
of the experiences that children have in preschool, kindergarten and through the 
grades. The films vividly portray the contrast between selected Head Start and selected 
elementary school classes. 

All filmed sequences have been coded according to a curricular schematic 
devised by Garfunkel (1967) which identifies activities according to curricular 
classification (activity, substantive or routine), substantive or activity 
category (construction, performance, play gratification, language, social science, 
snacks, clean up or rest), process focus (mechanistic routine, skill, perceptual, 
cognitive or social) and control (teacher or child dominated). Each sequence is 
also rated on the scales developed by observers. This allows for matching of 
contrasting curricular and stylistic sequences across and within location (inner-city, 
suburban) and grade level (Head Start and Kindergarten through Grade Five). 
Furthermore, it provides a basis for comparing filmed sequences on ten classes to observed, 
recorded and rated behaviors in 38 classes which were selected by using systematic 
and random sampling procedures. The validity of the films is, therefore, based both 
on the techniques and methods of selecting classes and filming them and, analytically, 
on obtaining comparable data from films of a limited sample of classes and from anecdotal 
reports and ratings on a representative sample of both inner-city and suburban classes. 

Preliminary findings from these studies document wide variations across 
Head Start, inner-city and suburban classes. The obvious next step is to follow children 
who have been exposed to certain styles of teaching and to compare their responses 
to elementary schools that offer similar and contrasting classroom environments. This 
can serve as a control for predicting how high and low changes on various measurement 
procedures will respond to continuous and discontinuous learning environments. Of 
particular interest will be the interactions between Head Start and elementary school 
stylistic variations on selected measures of achievement and social-emotional 
behaviors. 

HEAD START EVALUATION AND RESEARCH CENTER 

Boston University 

Scales for Rating Participant 
Observational Reports of 
Head Start Classes 



1. Attitude towards teaching situation 

This scale is specifically aimed at a judgement of whether the teacher enjoys 
the teaching situation and not whether she is a good teacher or whether the observer 
likes her. At the high end of this scale are such adjectives as happy, pleased, 
exhilarated, joyful, and so forth. At the low end of the scale are unhappy, miserable, 
sad, pained, and so forth. The judgement revolves around what the observer sees in 
the behavior of the teacher and not a projection by the observer as to whether O 
would be happy doing the things that the teacher is doing. This, as well as other 
judgements, will depend upon evidence that is collected in the course of 
observations, and it should be possible to cite that evidence. Therefore, it is 
theoretically assumed that the total behavioral protocol is reducible in such a way as 
to provide bits of evidence to support each scalar judgement. Without such 
reducibility, the judgement becomes simply a "gut reaction." While admitting 
that the "gut reaction" is an important part of perception and judgement, the 
process of collecting evidence and making judgements should force the observer to 
look deeply into his reaction and to make essentially two judgements: the first 
one being whether or not he can make a rating, and the second being conditional on 
an affirmative response to this. The condition of being able to make the rating 
will always depend upon the articulation of evidence to support a given judgement. 

2. Teacher's differentiation of children and activities 

At the high end of the scale we have a teacher who runs a class that has a high 
rating of individual instruction and who does not make demands upon groups of 
children to do the same things at the same time. High differentiation would involve 
either one of two strategies: a) where there is a special plan for each child 
depending upon his abilities and attitudes and b) where each child is allowed to 
go his own way and to seek out his own kind of activity and activity level. Low 
differentiation would be evident by a preponderance of classroom activities which 
involve all children. It does not follow from this that this scale will 
necessarily correlate with good teaching or poor teaching, but that it represents a 
style of teaching with respect to dealing with individual children or groups of 
children. 

3. Predominant Emphasis of Curriculum 

This is essentially a nominal scale which calls for a judgement on the part 
of the observer as to which of the categories suggests the principal manifest goals 
of the activity being observed. The extent to which these categories are ordinally 
related depends upon a presumed value system with regard to desired goals of 
preschool teaching. The categories to be used in this scale are taking directions, 
cognitive, perceptual, social-emotional and a further category, unclear, which 
indicates that no single emphasis can be inferred from observed activity. The judgement 
of which category a given sequence of behavior belongs to will depend upon 
the behavioral priority system that operates for a given class. For example, if a 
given lesson or period appears to be dominated by cognitive training but the 
behavior of the children causes changes in plans and redefinition of the program, 
then cognitive would be viewed as being a secondary goal, and the kind of activities 
which cognitive training gives way to would be the primary designation. It is essential 
that we observe classes closely and long enough so that we can make inferences 
about what the goals, in fact, are, rather than what they are said to be. Free play 
periods might be dominated by something like the learning of routines and/or 
language training. Perceptual training might very well be dominated by 
social-emotional considerations if the behavior of the children causes the teacher to shift 
the emphasis for individual children from time to time. As has been stated for 
the other scales, it will be necessary for observers to present evidence for manifest 
goals and to distinguish between the nominal categories of this scale and an overall 
judgement of effectiveness. A good deal of work will have to be done on this scale 
so that it presents the observer with a series of branching scales with alternative 
categories, but with a theoretical connection between the different branches. 

4. Purposefulness of classroom behavior 

An affirmative response to this scale will depend upon clear evidence of 
direction and continuity. One would expect to find a considerable amount of 
observer disagreement over this scale because this is particularly subject to 
whether or not the observer is in harmony with the teacher and is able to see the 
underlying goals of the class as it evolves. In order to rate a teacher as being 
purposeful and the class as being purposeful, it will be necessary to show 
evidence for continuity and direction; and similarly, in order for a teacher to be 
rated as being not purposeful, it will be necessary to point out discontinuity 
and to show many apparent shifts in direction during the course of observation. 

5. Control of Materials 



The question here is not so much whether it is the child or the teacher but, 
rather, whether the child has a say in either the gross selection of activities or 
materials or in their use after they are selected, or whether the teacher dominates 
both selection and use. 

6. Communication — Responsiveness 

This question is directed at the class and raises the issue of whether, 
whatever is going on in the class, there is great responsiveness to it on the 
part of the children or are they largely unresponsive or indifferent and, if any- 
thing, following through on routines rather than being responsive to activities 
and to the teacher. Responsiveness is indicated by a large amount of verbal and 
non-verbal communication, but it does not indicate that this communication is 
constructive or destructive or that it is good or bad. 

7. Work and Play 

At the high end of the scale, work and play are undifferentiated and the 
teacher makes little attempt to label or construct activities as being work or 
play; rather, they tend to meld together. At the low end of the scale there 
is a clear distinction: certain activities are presented as play activities and 
others are presented as work activities. 

8. This scale is for a total "gut reaction" to the teacher, class, and children; 
and it asks the observer to indicate, without any great demand for evidence, whether 
he thinks a given teacher is more or less effective. 

All of these scales are intended to get at ordinal distinctions between a specified 
sample of teachers that a given observer has been assigned. All judgements are 
necessarily comparative, and they will depend upon what the observer has seen as a 
part of the observational task. It is the job of the designer of the sample to 
make sure that each observer has a fair distribution of teacher variability in his 
sample and, furthermore, that this variability is not highly skewed. This means 
that the assignment of a sample of classes to a given observer must be preceded 
by enough observation to provide evidence for gross variability within a given 
sample of teachers. Samples for observers should have relatively homogeneous 
variance. 

October 10, 1966 

PROJECT CONCERN 

T. W. Mahan, Project Director 

Brief Summary of Theoretical Framework 



PROJECT CONCERN, although directly related to the problem of de facto segregation, 
is not essentially an experiment in integration; rather, it is an experiment in 
educational intervention designed to counteract the limited influence of urban 
education on the disadvantaged. Research has described the "cumulative deficit" 
which the child from the low socio-economic environment tends to exhibit in his 
school performance (a phenomenon which is dramatically accentuated among the 
non-white poor) and has underlined the profound task involved in reversing the trend. 
A review of the literature quickly communicates the impression that the problem 
goes beyond special teaching techniques, enriched materials, and better programming. 

PROJECT CONCERN will be evaluated by measured changes in pupil behavior. 
Nonetheless, it is important to outline, at least in skeletal fashion, the theoretical 
base from which these changes are predicted. Basically, the research stems from 
a conviction that changes in stimuli, environment and other input data can result 
in changes in response or output behavior. However, it is also felt that cognitive 
patterns for coping with formal learning situations and the affective responses 
which accompany these patterns have been well crystallized at the time of school 
entrance. This results in the use of traditional response patterns which, for the 
disadvantaged, are frequently ineffective for school goals. To counteract this 
established tendency it seems best to present the subject with an intense and 
pervasive experience in a radically different environment so that new responses 
can be provoked. This is the first stage of PROJECT CONCERN: to create some 
dissonance within the pupil in terms of his usual perception of himself in relation 
to school and to take advantage of this period of flux by reinforcing positive 
behaviors and attitudes. 

The second aspect of the intervention model is tied to the influence of peers as 
a basis for the development of role fulfilling behaviors. By placing a limited 
number of inner city youth (about 10% of the classroom population) in a suburban 
classroom these same youth will be constantly in contact with models of behavior 
more in keeping with school values. By limiting the impact of models which 
reinforce the current, ineffective behavior and emphasizing the impact of different, 
but reasonably consistent models, it is hoped that some "shaping" of the pupils' 
learning styles will take place in the direction of increased academic performance. 

As a catalyst to prevent too much dissonance which might create a withdrawal 
and/or rejection reaction, significant adult figures who share much of the child's 
heritage but also exhibit the desired characteristics in terms of attitudes toward 
school and learning are provided in the supportive team. The effectiveness of this 
additional factor in the change process is a focus of the research design and, 
hopefully, evidence will be available at the termination of the project to determine 
the differential impact of the learning environment as separated from the impact of 
adult identification figures. 



In essence, PROJECT CONCERN focuses around the change in perception, already to a 
large extent stereotyped, which can be accomplished by a confrontation with experi- 
ences highly charged with novelty but also in a context of interpersonal support. 

It is predicted that changes will take place and that they will take place in the 
direction of the models which the suburban youth present to the bussed pupils. 

Experimental Design 

PROJECT CONCERN is designed to determine the relative effectiveness of a radically 
different educational environment as a preventive and corrective intervention in 
the education of urban youth from the inner city. The theoretical rationale for 
the position has been discussed above, but the pragmatic aspects must be mentioned 
briefly here. The "vacant seat" basis for pupil assignment has resulted in considerable 
variability in the placement, with some classes having only one experimental S while 
others have four. This in turn has created a situation which results in the 
experimental Ss being spread across thirty-three (33) schools while control Ss are drawn 
from six (6) schools. Hopefully, this diversity will have a self-cancelling effect 
which will underline the impact of the experimental variable, the treatment 
procedure. In this same regard, it is also important to stress that the Experimental 
Ss not receiving external supportive services are all placed in one school system 
(6 schools) and that generalizations from their performance must be made with that 
fact clearly in mind. 

Nonetheless the design seems adequate to examine the relative impact of four (4) 
methodologies on the learning, attitudes and motivations of inner-city youth. 

These methodologies, in order of their predicted effectiveness, are as follows: 

1) Placement in a suburban system with supportive team assistance. 

2) Placement in a suburban system without supportive team assistance. 

3) Placement in an inner city school with supportive team assistance. 

4) Placement in an inner city school without supportive team assistance. 

Ss assigned to treatment procedures one (1) and two (2) above are considered to be
Experimental Ss since they are subject to the impact of the major variable under
study: placement in a radically different educational environment. Ss assigned
to treatment procedures three (3) and four (4) above are classified as controls.
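
The classification just described can be written out as a small lookup. This is only an illustrative sketch: the class name `Treatment`, its field names, and the function `group_of` are assumptions introduced here and do not come from the Project's own materials.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Treatment:
    """One of the four treatment procedures (names are hypothetical)."""
    number: int             # order of predicted effectiveness, 1 = highest
    placement: str          # "suburban" or "inner city"
    supportive_team: bool   # whether supportive team assistance is provided


# The four procedures in the order the text lists them.
TREATMENTS = [
    Treatment(1, "suburban", True),
    Treatment(2, "suburban", False),
    Treatment(3, "inner city", True),
    Treatment(4, "inner city", False),
]


def group_of(treatment: Treatment) -> str:
    """Suburban placement defines the Experimental Ss; the rest are controls."""
    return "experimental" if treatment.placement == "suburban" else "control"


if __name__ == "__main__":
    for t in TREATMENTS:
        print(t.number, t.placement, t.supportive_team, group_of(t))
```

Supportive team assistance is kept as a separate field because the text treats it as a second treatment variable that cuts across both the experimental and the control groups.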

As described above all Ss were drawn from the same population in a random fashion. 
Schematically, the design is as follows: 
The criterion variables which will serve as the basis for evaluating the effect of the
treatment variables (suburban school placement and supportive team assistance) can be
grouped under four (4) general headings:

a) Mental Ability 

1. Wechsler Intelligence Scale for Children 

2. Primary Mental Abilities 

b) Academic Achievement 

1. Reading 

2. Listening 

3. Arithmetic 

c) Personal-Social Development 

1. Sociometric Status

2. Test Anxiety 

3. Attitudes 

4. Teacher Ratings 

5. School Attendance

6. Vocational Aspiration

d) Creativity

1. Picture Completion 

2. Circles 



These data will be collected at four points: September, 1966, as a base; May, 1967,
to evaluate effects after one year; September, 1967, to assess loss during the summer;
May, 1968, to evaluate effects after two years. The basic statistical tests to be
used will be analyses of variance and covariance. All data will be analyzed
[illegible] tables with the primary variables of age, sex, grade
placement, school system, and, where the N permits, school.
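
The purpose of each collection point can be made concrete with simple differences between testings. The sketch below is an illustration only: the wave labels, the function name `wave_contrasts`, and the scores in the usage example are hypothetical, and the Project's actual analyses were to be analyses of variance and covariance, not raw differences.

```python
# Labels for the four collection points named in the text (labels are assumed).
WAVES = ("1966-09", "1967-05", "1967-09", "1968-05")


def wave_contrasts(scores):
    """Return the three contrasts implied by the collection schedule.

    `scores` maps each wave label to one pupil's score on a single measure.
    """
    missing = [w for w in WAVES if w not in scores]
    if missing:
        raise KeyError(f"missing collection points: {missing}")
    base, may67, sep67, may68 = (scores[w] for w in WAVES)
    return {
        "one_year_effect": may67 - base,   # base to end of first year
        "summer_change": sep67 - may67,    # negative values indicate summer loss
        "two_year_effect": may68 - base,   # base to end of second year
    }


if __name__ == "__main__":
    # Made-up scores for a single pupil, for illustration only.
    pupil = {"1966-09": 40, "1967-05": 52, "1967-09": 49, "1968-05": 60}
    print(wave_contrasts(pupil))
```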



In addition, case study materials reported on a weekly basis by teachers will be
utilized in an attempt to discover patterns of growth and development. Along with
this approach there will be data collected which will indicate parental involvement
and attitude as well as neighborhood reaction to a child's placement in the suburbs.
It is anticipated that there will be significantly greater growth for the
Experimental Ss as a group, but it is also hoped that evidence as to the most productive
and effective intervention for pupils with differing characteristics may be revealed by
careful manipulation of the results.



The techniques described above will be employed on the total samples. However,
it is expected that smaller samples drawn from these samples will be used to
study other areas such as speech improvement, frustration tolerance, and personality
variables. The major outcomes of the Project will be evaluated from this design
framework by means of the following specific hypotheses, stated here as predictions.
For operational purposes, a "statistically significant difference" shall be defined
as a deviation of such magnitude that its likelihood of occurring by chance does
not exceed one in twenty.
1) Experimental Ss will have significantly greater gain scores
than control Ss in:

a) all measures of mental ability 

b) all measures of academic achievement 

c) all measures of cognitive flexibility (creativity) 

2) Experimental Ss will show significantly greater decrease than 
control Ss in measures of: 

a) general anxiety 

b) test anxiety 

3) Experimental Ss will not differ significantly from control Ss
in sociometric measures of: 

a) acceptance by classroom peers 

b) acceptance by neighborhood peers 

4) Analyses of teacher report data on Experimental Ss will show
a pattern of sequential responses which follows the following
trend for Ss who show significant gains in academic performance:
uncritical acceptance by the teacher; more realistic appraisal
by the teacher, but with a tendency to emphasize assets; a
tendency to recall and report successes and achievements;
attainment of a plateau in terms of reporting pupil behavior
as being relatively unexceptional and consistent.
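
The gain-score comparison in hypothesis 1 and the one-in-twenty criterion can be illustrated as follows. This is a sketch under stated assumptions, not the Project's procedure: the text specifies analyses of variance and covariance, whereas the example substitutes an exact one-sided permutation test because it can be written with the standard library alone. The function names and the scores in the usage example are hypothetical.

```python
from itertools import combinations
from statistics import mean

ALPHA = 1 / 20  # "likelihood of occurring by chance does not exceed one in twenty"


def gain_scores(pre, post):
    """Gain score for each S: later testing minus earlier testing."""
    return [after - before for before, after in zip(pre, post)]


def one_sided_permutation_p(experimental, control):
    """Exact p-value for 'experimental mean gain exceeds control mean gain'.

    Every way of splitting the pooled gains into groups of the original
    sizes is enumerated, so this is only practical for small samples.
    """
    observed = mean(experimental) - mean(control)
    pooled = list(experimental) + list(control)
    n_exp, n_ctl = len(experimental), len(control)
    total = sum(pooled)
    as_extreme = 0
    n_splits = 0
    for idx in combinations(range(len(pooled)), n_exp):
        exp_sum = sum(pooled[i] for i in idx)
        diff = exp_sum / n_exp - (total - exp_sum) / n_ctl
        n_splits += 1
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            as_extreme += 1
    return as_extreme / n_splits


if __name__ == "__main__":
    # Made-up gain scores, for illustration only.
    exp_gains = [8, 9, 10, 11]
    ctl_gains = [1, 2, 3, 4]
    p = one_sided_permutation_p(exp_gains, ctl_gains)
    print(p, "significant" if p <= ALPHA else "not significant")
```

For hypothesis 2, which predicts a greater decrease in anxiety, the same function could be applied to the negated gain scores.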



Appendix C

Category                           Examples

INSTRUCTIONS

A. Materials

   Characteristics and amount      teacher prepared, commercial, student prepared

   Content                         specifically, the amount, nature, or
                                   characteristics of topics related to urban
                                   environments or problems.

B. Lessons                         by teacher or student and the amount.

   Interpretation                  "What would have happened if there were
                                   no Civil War?"

   Deviations within lessons       Does the teacher allow students to introduce
                                   or follow issues that may lead away from
                                   lessons?

   Spontaneity                     Does T. allow asides, immediate student
                                   reactions, etc. during lessons?

   Opportunity for Participation   Does T. call on all students? Do faster
                                   ones dominate? Are slow ones encouraged
                                   and given a chance?

   Individual Participation        Amount of individual reading, board work,
                                   participation.

   *What are the project student's reactions during recitation? How much
   participation, attention, cooperation?

C. Motivation

   Origin                          teacher, children, a combination through
                                   some form of theme.

   Pursuit                         Does T. follow children's ideas, accounts,
                                   even fantasies?

   Characteristics                 What is discussed? How is the environment
                                   utilized?

D. Evaluation-Achievement

   Type                            tests, oral statements, displays of students'
                                   works. (Are project students' works
                                   displayed?)




Category                           Examples

TEACHER-STUDENT INTERACTION

A. Humor                           Does T. utilize humor to include students as
                                   opposed to ridicule?

B. Address                         How does she address individuals or the
                                   class? "Boys and girls." "Students."
                                   "Children." Last names, first names.

C. Feelings                        Does she express or discuss her own feelings
                                   and attempt to elicit those of the students?

D. Reward-Punishment               How does she express her favor or disfavor?
                                   "I'm proud of you." "I like obedient
                                   children."

   *Examples of specific interaction with project students.

Category                           Examples

DISCIPLINE

A. Form

   Verbal-direct                   "Sit down." "Don't do that."

   Verbal-indirect                 "Please write the word." "Why don't
                                   you put your books away."

   Auditory                        Clapping the hands, striking the piano

   Visual                          The evil eye

   Physical                        Holding, touching, etc.

B. Amount                          How many discipline instances during any
                                   one visit?

C. Tone                            Must the class be completely silent?
                                   How much noise is allowed?

D. Consistency                     Is the teacher consistent with her rules
                                   and enforcing them?

E. Student pressure                Are there occasions when students discipline
                                   or assist the teacher in this area by
                                   bringing pressure upon others? "Sssh, be
                                   quiet."

PHYSICAL ORGANIZATION OF CLASSROOM

A. Characteristics                 Straight rows, tables, clusters of two
                                   and three desks.

B. Room divisions                  Are there study areas, work areas, hobby
                                   areas, reading areas, etc.?

C. Interaction                     Does room organization assist teacher-
                                   student and student-student interaction?

STUDENT INTERACTION

A. Opportunity                     Do the seating, lessons, and assignments
                                   allow or encourage interaction? Learning
                                   groups, work groups, teacher's assistants.

B. Characteristics                 Describe interactions. Students selecting
                                   one another to write spelling words on
                                   board, or to clean the desks, etc.

   *Degree of project student's "mix." Do they choose others? Are they
   aggressive, moderate, or retiring in their interactions?
CLASSROOM OBSERVATION AND WRITTEN REPORTS:
INSTRUCTIONS FOR OBSERVERS



INTRODUCTION

In order for us to use your observations of classrooms most effectively, it will
be necessary for us to have several kinds of reports which will reflect, in a variety
of ways, the teacher and child behaviors which you have observed in the classes
assigned to you. These reports must be detailed enough and must include sufficient
affect so that other readers can read a series of reports and rate them in ways
similar to the ways in which you will be requested to rank and rate the various
classes that you are observing. This does not call for the suppression of your
biases, but rather the ready admission of them and explicit attempts to distinguish
between those behaviors which you take a liking to and those
behaviors which you think are of high quality. This means that you have not only
to observe and report what you see, but also to assimilate what you see into the
working model that is represented by your ideas, feelings, and experiences. We
shall bring together the various models of the several observers into an integrated
framework which is controlled partially by the outline which was distributed and,
further, by a series of scales which will be presented to you after you have
concluded your observations.

The process of abstracting from classroom behaviors to your observations, and
then to your written reports and then, still further, to a series of relevant scales
is a difficult one which will depend on the kinds and degrees of differences that are
found between the various classes that you observe. Difficulty is, at the same time,
a function of the differences that exist within any one class over a period of time.
The process that is being constructed will give a more or less clear indication of
whether classes are describably and meaningfully different and, to a lesser extent,
the degree of differences between these classes. The reliability of the process will
depend upon the clarity and comprehensiveness of the written reports. It is
necessary both to be able to carefully describe the classes that we see and
to make some clear statements about how equivocal or unequivocal the system of
measurement is when it is put to a fair test. In this case, the tests will include
the observations of classes by different observers as well as the ratings of the
classes by individuals who have not seen them, but who have access to the written
reports.
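
One way such a test might be summarized is by comparing the rank order an observer gives a set of classes with the rank order produced by a reader who saw only the written reports. The sketch below uses Spearman's rank correlation for untied ranks. The choice of statistic is an assumption made here for illustration, since these instructions do not name one, and the function name and rankings are hypothetical.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two rankings of the same classes.

    Uses the shortcut formula 1 - 6*sum(d^2) / (n*(n^2 - 1)), which is
    valid only when neither ranking contains ties.
    """
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("rankings must have the same length, at least 2")
    if len(set(ranks_a)) != n or len(set(ranks_b)) != n:
        raise ValueError("shortcut formula requires rankings without ties")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


if __name__ == "__main__":
    # Made-up rankings of five classes, for illustration only.
    observer = [1, 2, 3, 4, 5]   # ranks from direct observation
    reader = [2, 1, 3, 5, 4]     # ranks from reading the written reports
    print(spearman_rho(observer, reader))
```

A value near 1 would suggest that the written reports carry enough detail for a reader to reproduce the observer's ordering. A value near 0 would suggest that they do not.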
OUTLINE FOR CLASSROOM OBSERVATION

This outline, which was distributed to each observer, is not to be used as a
checklist or as an observational guide. Rather, it should be used in the following
way: observers should read and reread it carefully so that they are quite familiar
with the various categories and sub-categories that describe a more or less
all-inclusive listing of behavioral possibilities in classroom situations. The outline
does not represent a mutually exclusive system nor does it cover the detail which
would bring it so much closer to the classroom situation. Observers should be quite
familiar with it, but they should not actively use it during the course of their
observations. After completing process reports, they should refer back to the outline
in order to sensitize themselves to the kinds of information they are getting and the
behaviors and situations which they should attend to on future visits to the class.

The outline will be referred to again when the summary report is discussed below.



PROCESS REPORTS 

These should include a detailed statement of everything that is observed in the
classroom including the behaviors of the teacher and children, the physical
characteristics of the classroom, the materials that are used and any other
observations which are pertinent to discussing the class. These reports are to
be thought of as the total of the observer/class interaction and they should not
exclude the observer and his feelings from the report.

Observers will differ in the way in which they construct this process report,
but the end result should be pretty much the same. Some of you will take notes as
you are observing the class, others should write out a detailed report immediately
after leaving the class, and still others might develop a system for sketching out their
observations so that they can then be transcribed into a running commentary describing
what was seen and how it was seen.

These process reports are the raw materials for everything that follows and a
single report should be made out for every observation of the class. Therefore, each
observer will have at least two and preferably three process reports on each class
that they observe.

It is hoped that these reports will not simply be a rather dry chronological
listing of everything that happens but that they will include appropriate adjectives
and interpretations that are a part of the observational process. The total
interpretation of a given teacher and classroom will come in a later report. What
we are interested in here are the more minute interpretations of the specific
behaviors that are observed. We are not specifically attending to
fragmentary quantitative questions such as how many times a given child is
reprimanded or how often the teacher talks as opposed to how often the children talk. But
we should be quite aware of duration and quantity, and appropriate notes should be
made about persistent kinds of behaviors that take place.

The process reports will be used in two ways: in the first place, they will be
used by independent readers who will make judgements about the classes from reading
these reports; in the second place, they will be used to document the findings of
the survey, and relevant parts of these reports will be abstracted and integrated
into a total report of all classes. In both uses of the process reports it is
necessary to have writing that is provocative and comprehensive and that projects
the reader into the classroom so that he gets a feeling for what is taking place
and how it is taking place.



ANALYTICAL REPORTS 

There should be an analytical report for each visit to a classroom. This
report represents the observer's explanation and synthesis of what he has seen.
It can draw upon the material from the process report, but it is not an observational
report as such but rather a critical appraisal of the classroom for the period of
time that was observed. If there is no substantial difference between several
process reports, it is possible to combine several of these into one analytical
report. However, in general, there will be a separate analytical report for each
process report.

The analytical report should refer back to the outline and should assess which
parts of the outline are most relevant for the class under consideration, and what
kinds of information are not readily obtainable either because of the structure of
the class or because of the accident of having observed a particular kind of class
or a particular segment of the curriculum.



SUMMARY INTERPRETATION 

There will be one summary interpretation for each teacher that you observe.
This will draw upon the several process reports and analytical reports and it should
integrate all of the material that you have in your possession. This summary report
should have two sections to it. The first is an open-ended judgemental and inferential
report describing the essentials of the observed behavior over the two or
three observational periods. It should be completely open-ended (projective) in that
you are free to draw on any material that you have in any of the visits and you
should underline freely as you see fit. The second part of the summary report should
closely follow the outline and should comment on each of its major sections. If there
are many omissions here then it should be clear that you have not observed the class
either a sufficient number of times or for a sufficiently long period on any one
occasion. We continually have to address ourselves to the question of whether we have
observed behaviors which make any particular class comparable to other classes.

Classroom observation is continually plagued by the lack of comparability of
data. In one class a teacher may do a large amount of talking and it might be
considered to be extremely important in assessing her effectiveness. Another teacher
may also do a lot of talking but it might be trivial compared to other behaviors
which she displays in her work with children. This means that the problem of
describing and evaluating teachers has to consider more and less effective behaviors
as well as behaviors which are not applicable in an assessment of effectiveness.

Somewhere along the line, we must make judgements which stem from our
descriptions and which say something meaningful about the degree and kind of impact
a particular teacher might have. We must obtain a sufficient amount of material on
teachers to make judgements about how effective they are with respect to teaching
academic subject matter, providing an environment for individual
self-determination, and encouraging appropriate interpersonal relationships between the
teacher and the children and among the children.
Scales for Rating Elementary School Classes*

 1. Involvement and interest of children
       Indifference, Apathy .................. Curiosity, Absorption

 2. Purposeful behavior of class
       Aimless, Wandering .................... Direct, Responsive

 3. Source of direction of academic activities
       Teacher ............................... Child

 4. Nature of control over behavior (Teacher)
       Coercion, Threat ...................... Trust, Respect

 5. Effectiveness of behavioral controls (Teacher)
       None, Class out of control ............ Complete, Class well controlled

 6. Quality of presentation of subject or materials (Teacher)
       Pedestrian, Routine ................... Creativity, Variety, Innovation

 7. Differentiation of instruction (Teacher)
       Monolithic, Uniformity ................ Highly differentiated,
                                               Individually discriminate

 8. Teacher reaction to classroom situation (Teacher)
       Unhappy, Indifferent, Hostile ......... Happy, Involvement with children,
                                               Obvious enjoyment

 9. Reinforcement of behavior of children
       Not apparent .......................... Frequent

10. Nature of reinforcement (Teacher)
       Negative, Punitive, Threatening ....... Bribery ....... Positive, Approval,
                                                               Encouragement

*All teachers observed by a given rater are to be sorted into five categories so that
two-thirds of the teachers are in categories 2, 3, and 4; one-third are to be in
categories 1 and 5. Category 1 is the left-hand side of each scale and category 5 the
right-hand side. Category 3 is an intermediate category.
VARIABLES FOR OBSERVATIONAL SCHEDULES
(WITH SELECTED REFERENCES)

Variable types are listed for each object of observation. Numbers following an
entry refer to the selected references at the end.

Object: Pupil

   Independent (Characteristics):
      School background
      Placement procedure, 2
      Diagnostic information, 2: Aptitude, Achievement, Personality, Family-Home

   Dependent (Behavioral):
      Problem solving, 2
      Motivation, 2
      Attention, 1
      Curiosity, 1
      Activity, 2
      Origination
      Mobility
      Participation
      Disruption
      Individuality, 2
      Pupil-Pupil Interaction, 2
      Sociometric variables

   Dependent (Curricular):
      Grouping, 2
      Getting help, 1
      Independent Activity, 1, 2

Object: Teacher

   Independent (Characteristics):
      Education
      Experience
      Age
      Sex
      Certification(s)
      Professional Organizations and Journals
      Attitudes, 3

   Dependent (Behavioral):
      Preparation, 2
      Direction, 2
      Presentation Variety, 1
      Sequence, 2
      Verbal-Nonverbal
      Management-Discipline
      Empathy-Support-Humor, 1
      Evaluation-Criticism, 2
      Reinforcement-Rewards, 1

   Dependent (Curricular):
      Use of curriculum guide, 2
      Textbooks, workbooks, 2
      Teacher-prepared materials, 2
      Evaluation-Reports, 2

Object: Pupil-Teacher Interaction

   Independent (Characteristics):
      Not applicable

   Dependent (Behavioral):
      Direction-Initiative, 1
      Social Organization-Teacher or pupil centered, 1
      Delegation of responsibility

   Dependent (Curricular):
      Differentiation, 1, 2
      As related to content and procedure

Object: Classroom

   Independent (Characteristics):
      Demographic
      Location-Type of community
      Size
      Equipment
      Supervision-reported
      Level

   Dependent (Behavioral):
      Supervision-observed
      Climate, 1
      Routines, 2
      Discussion, 2
      Competition, 2
      Order-Disorder, 2

   Dependent (Curricular):
      Content, 1: Academic-Vocational-Crafts-Social-physical and recreational
      Subject or project
      Consultants-music, art, physical education, 2



1. Classroom Observation Code Digest (Cornell, Lindvall, Saupe, 1952)

2. Schedule for observing special class for mentally retarded children
   (Blatt, 1963)

3. Minnesota Teacher Attitude Inventory (Cook, Leeds, and Callis, 1951)



